How to Find the top N most frequent words in a large text file using PySpark

Spark NLP Pipelines

AHKCon 5: RegEx & Parsing Strings with StrSplit etc.

Big Data Master's Program - PySpark RDD Session-4

Batch 6 Session 1 : Part 3 (Basic and Bit Advanced Boolean Operators)

Python2-FA20-Session1: Course overview and data types

Quick to Production with the Best of Both Apache Spark and Tensorflow on Databricks

Introduction to Apache Spark on SHARCNET

Apache Spark 2 - Spark Architecture and Execution Modes

Powering TensorFlow with big data (Apache BEAM, Flink, and Spark)

OpenShift Commons Big Data SIG #3: Building Cloud Native Apache Spark Applications with OpenShift

Python for Big Data Analytics - 1 | Python Hadoop Tutorial for Beginners | Python Tutorial | Edureka

Hive and Cassandra

Big Data 2025 Lecture 3: Finding Similar Items

Solving HackerRank AI Challenges -- episode 2

Week 4 SEC | Lab 3: PySpark

Data Engineer's Lunch #3: Scripting / Shell Automation for Data Engineering

Webinar on Social Network Analysis | Hashtag Analysis | Twitter Analysis | NLP | Pantech Live

Graph Neural Networks: A Basic Exploration

PyData Berlin May Live Stream

I don't like notebooks. - Joel Grus (Allen Institute for Artificial Intelligence)

NLP Series III: Text Retrieval Embeddings, E5. Algorithm and code read.

sasavd01_20180928

“Overview of Spark MLLib” (Part II) Ihor Bobak

AI Weekly News #8 October 13th, 2019